7 research outputs found

    A fine-grained parallel dataflow-inspired architecture for streaming applications

    Data-driven streaming applications are common in modern multimedia and wireless systems, for example in video and audio processing. The main components of these applications are Digital Signal Processing (DSP) algorithms. These algorithms are not structurally complex, and the operations that make them up are fairly simple (usually binary mathematical operations such as addition and multiplication). What makes them challenging to implement and execute efficiently is their large degree of fine-grained parallelism and the required throughput. DSP algorithms can usually be described as dataflow graphs, with nodes corresponding to operations and edges between the nodes expressing data dependencies. A node fires, i.e. executes, as soon as all required input data has arrived at its input edge(s).
To execute DSP algorithms efficiently while maintaining flexibility, coarse-grained reconfigurable arrays (CGRAs) can be used. CGRAs are composed of a set of small, reconfigurable cores, interconnected in, e.g., a two-dimensional array. Each core by itself is not very powerful, yet the complete array of cores forms an efficient architecture with a high throughput because it can execute many operations in parallel.
In this thesis, we present a CGRA targeted at data-driven streaming DSP applications that contain a large degree of fine-grained parallelism, such as matrix manipulations or filter algorithms. Along with the architecture, we present a programming language that can directly describe DSP applications as dataflow graphs, which are then automatically mapped to and executed on the architecture. In contrast to previously published work on CGRAs, the guiding principle and inspiration for the presented CGRA and its corresponding programming paradigm is the dataflow principle.
The result of this work is a completely integrated framework targeted at streaming DSP algorithms, consisting of a CGRA, a programming language and a compiler. The complete system is based on dataflow principles. We conclude that by using an architecture that is based on dataflow principles and a corresponding programming paradigm that can directly express dataflow graphs, DSP algorithms can be implemented in a very intuitive and straightforward manner.
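The firing rule described in this abstract (a node executes as soon as data is present on all of its input edges) can be illustrated with a minimal Python sketch. This is not the thesis's architecture, language, or compiler; the `Node` class, the `run` scheduler, and the example graph computing (a + b) * c are hypothetical names and choices made purely for illustration.

```python
from collections import deque
import operator


class Node:
    """A dataflow node: one FIFO per input edge, fires when all hold a token."""

    def __init__(self, op, n_inputs=2):
        self.op = op
        self.inputs = [deque() for _ in range(n_inputs)]  # one queue per input edge
        self.consumers = []  # (node, port) pairs fed by this node's output
        self.outputs = []    # results collected when the node has no consumer

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def can_fire(self):
        # Firing rule: every input edge must hold at least one token.
        return all(self.inputs)

    def fire(self):
        args = [queue.popleft() for queue in self.inputs]
        result = self.op(*args)
        if self.consumers:
            for node, port in self.consumers:
                node.inputs[port].append(result)
        else:
            self.outputs.append(result)


def run(nodes):
    """Keep firing enabled nodes until no node can fire any more."""
    fired = True
    while fired:
        fired = False
        for node in nodes:
            if node.can_fire():
                node.fire()
                fired = True


# Hypothetical example graph: out = (a + b) * c, applied to three-sample streams.
add = Node(operator.add)
mul = Node(operator.mul)
add.connect(mul, 0)

add.inputs[0].extend([1, 2, 3])     # stream a
add.inputs[1].extend([10, 20, 30])  # stream b
mul.inputs[1].extend([2, 2, 2])     # stream c

run([add, mul])
print(mul.outputs)  # [22, 44, 66]
```

Control here is driven entirely by data availability: the scheduler has no program counter, and the multiplier stays idle until the adder delivers a token.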

    Fine-Grained ASIP Power Gating: Gate-level study of the break-even point for fine-grained power gating

    Power consumption in portable electronic devices is a crucial design factor. While technology at 90 nm and above is still dominated by dynamic power, it is expected that leakage power will gain importance in sub-90 nm technologies. One commonly used technique to reduce leakage power is power gating, which is still an active research topic, especially at the fine-grained level. The purpose of this thesis was to explore the impact of fine-grained power gating on the datapath of a VLIW processor. In addition, a detailed analysis of the savings versus the introduced overhead was performed to derive a generic formula for a quick estimation of the energy efficiency of power gating. During the work, a workflow to partition the system into power domains was developed. Furthermore, a verification method was implemented that validates whether a power-gated resource is scheduled by the compiler or not. A configurable HSPICE simulation flow was implemented to determine how many power switches were required for a specific power domain as well as the energy consumed when switching a power domain on. Two processors with different usage profiles, designed with the help of different tools, were investigated in this thesis. The processors were modified to support power gating, and a synchronous power manager was developed. After RTL-level verification of functional correctness, the resulting systems were synthesised, placed and routed for 100 MHz with two different 90 nm TSMC libraries (low power and general purpose) to evaluate the variation between different technology flavours. The results showed that a large contributor to the energy overhead of power gating is the dynamic power of additionally required modules, e.g., the isolation cells at the output of a power domain and the power manager. It was also shown that the power domains need a very low duty cycle for power gating to be applied efficiently.
It has been shown that the energy overhead for fine-grained power gating is significant and is mainly caused by additional modules that have to be added to the system. Therefore, power gating can only be beneficial in designs with sufficiently large power domains and a low duty cycle. However, power is mainly consumed by the memories, and for 90 nm, leakage power is a rather small fraction of the total power consumption. The possible overall savings when focussing on the datapath are therefore very limited.

    A Haskell-Based Programming Paradigm for Coarse-Grained Reconfigurable Arrays

    Programming coarse-grain reconfigurable arrays (CGRAs) is a challenging task. In this work, we exploit the algebraic structure which is often present in the specification of a streaming application to distribute the different parts of a computation over a multi-core architecture. This architecture is dataflow-based, so that the control of the cores coincides with the availability of data on the channels. In this paper, we focus on the compiler, not the architecture. We formulate our work in the functional programming language Haskell since that is close to a mathematical formalism

    The Challenges of Implementing Fine-Grained Power Gating

    Power consumption in digital systems, especially in portable devices, is a crucial design factor. Due to the downscaling of technology, dynamic switching power is no longer the only relevant source of power consumption, as power dissipation caused by leakage currents increases. Even though power gating is a seemingly simple method for reducing leakage power, the implications of introducing power gating to a design have to be analyzed in detail. We present an extensive analysis of the impact of fine-grained power gating on the overall power consumption. The presented results are based on the analysis of an actual implementation of power gating in the datapath of a very long instruction word (VLIW) processor. The extracted power consumption values clearly demonstrate that the overhead of power gating is, contrary to the analyses found in previous publications, not determined by the energy required to switch a power domain on. Rather, it is determined by the energy consumption of additionally required modules. We show that, for the break-even point case, about 2/3 of the energy overhead is caused by the isolation cells, about 1/3 by the control modules, and only roughly 1% by the energy to switch a power domain on.
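The break-even idea in this abstract can be made concrete with a small Python sketch. It uses a deliberately simplified model that is an assumption of this sketch, not the paper's method: all three overhead contributions are treated as fixed energies per gating cycle, and gating breaks even once the leakage energy saved while the domain is off equals their sum. The function names and all numeric values are hypothetical; the example energies are chosen only to mirror the reported shares of roughly 2/3, 1/3, and 1%.

```python
def overhead_shares(e_isolation, e_control, e_switch_on):
    """Fraction of the total gating overhead attributed to each contributor."""
    total = e_isolation + e_control + e_switch_on
    return {
        "isolation": e_isolation / total,
        "control": e_control / total,
        "switch_on": e_switch_on / total,
    }


def break_even_idle_time(p_leak_saved, e_isolation, e_control, e_switch_on):
    """Idle time (s) at which saved leakage energy equals the overhead energy.

    Simplified model (assumption): p_leak_saved * t_idle = total overhead.
    """
    if p_leak_saved <= 0:
        raise ValueError("saved leakage power must be positive")
    return (e_isolation + e_control + e_switch_on) / p_leak_saved


# Hypothetical energies in joules, not measurements from the paper.
E_ISOLATION = 66e-12
E_CONTROL = 33e-12
E_SWITCH_ON = 1e-12
P_LEAK_SAVED = 1e-6  # hypothetical leakage power saved while gated, in watts

shares = overhead_shares(E_ISOLATION, E_CONTROL, E_SWITCH_ON)
t_break_even = break_even_idle_time(P_LEAK_SAVED, E_ISOLATION, E_CONTROL, E_SWITCH_ON)
print(shares)
print(t_break_even)
```

Under this model, shrinking the switch-on energy barely moves the break-even point, because the isolation and control terms dominate the numerator.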

    A comprehensive laboratory study on the immersion freezing behavior of illite NX particles: a comparison of 17 ice nucleation measurement techniques

    Immersion freezing is the most relevant heterogeneous ice nucleation mechanism through which ice crystals are formed in mixed-phase clouds. In recent years, an increasing number of laboratory experiments utilizing a variety of instruments have examined immersion freezing activity of atmospherically relevant ice-nucleating particles. However, an intercomparison of these laboratory results is a difficult task because investigators have used different ice nucleation (IN) measurement methods to produce these results. A remaining challenge is to explore the sensitivity and accuracy of these techniques and to understand how the IN results are potentially influenced or biased by experimental parameters associated with these techniques. Within the framework of INUIT (Ice Nuclei Research Unit), we distributed an illite-rich sample (illite NX) as a representative surrogate for atmospheric mineral dust particles to investigators to perform immersion freezing experiments using different IN measurement methods and to obtain IN data as a function of particle concentration, temperature (T), cooling rate and nucleation time. A total of 17 measurement methods were involved in the data intercomparison. Experiments with seven instruments started with the test sample pre-suspended in water before cooling, while 10 other instruments employed water vapor condensation onto dry-dispersed particles followed by immersion freezing. The resulting comprehensive immersion freezing data set was evaluated using the ice nucleation active surface-site density, ns, to develop a representative ns(T) spectrum that spans a wide temperature range (−37 °C < T < −11 °C) and covers 9 orders of magnitude in ns. In general, the 17 immersion freezing measurement techniques deviate, within a range of about 8 °C in terms of temperature, by 3 orders of magnitude with respect to ns. 
In addition, we show evidence that the immersion freezing efficiency expressed in ns of illite NX particles is relatively independent of droplet size, particle mass in suspension, particle size and cooling rate during freezing. A strong temperature dependence and weak time and size dependence of the immersion freezing efficiency of illite-rich clay mineral particles enabled the ns parameterization solely as a function of temperature. We also characterized the ns(T) spectra and identified a section with a steep slope between −20 and −27 °C, where a large fraction of active sites of our test dust may trigger immersion freezing. This slope was followed by a region with a gentler slope at temperatures below −27 °C. While the agreement between different instruments was reasonable below ~ −27 °C, there seemed to be a different trend in the temperature-dependent ice nucleation activity from the suspension and dry-dispersed particle measurements for this mineral dust, in particular at higher temperatures. For instance, the ice nucleation activity expressed in ns was smaller for the average of the wet suspended samples and higher for the average of the dry-dispersed aerosol samples between about −27 and −18 °C. Only instruments making measurements with wet suspended samples were able to measure ice nucleation above −18 °C. A possible explanation for the deviation between −27 and −18 °C is discussed. Multiple exponential distribution fits in both linear and log space for both specific surface area-based ns(T) and geometric surface area-based ns(T) are provided. These new fits, constrained by using identical reference samples, will help to compare IN measurement methods that are not included in the present study and IN data from future IN instruments

    A comprehensive laboratory study on the immersion freezing behavior of illite NX particles: a comparison of seventeen ice nucleation measurement techniques

    Immersion freezing is the most relevant heterogeneous ice nucleation mechanism through which ice crystals are formed in mixed-phase clouds. In recent years, an increasing number of laboratory experiments utilizing a variety of instruments have examined immersion freezing activity of atmospherically relevant ice nucleating particles (INPs). However, an inter-comparison of these laboratory results is a difficult task because investigators have used different ice nucleation (IN) measurement methods to produce these results. A remaining challenge is to explore the sensitivity and accuracy of these techniques and to understand how the IN results are potentially influenced or biased by experimental parameters associated with these techniques. Within the framework of INUIT (Ice Nucleation research UnIT), we distributed an illite rich sample (illite NX) as a representative surrogate for atmospheric mineral dust particles to investigators to perform immersion freezing experiments using different IN measurement methods and to obtain IN data as a function of particle concentration, temperature (T), cooling rate and nucleation time. Seventeen measurement methods were involved in the data inter-comparison. Experiments with seven instruments started with the test sample pre-suspended in water before cooling, while ten other instruments employed water vapor condensation onto dry-dispersed particles followed by immersion freezing. The resulting comprehensive immersion freezing dataset was evaluated using the ice nucleation active surface-site density (ns) to develop a representative ns(T) spectrum that spans a wide temperature range (−37 °C < T < −11 °C) and covers nine orders of magnitude in ns. Our inter-comparison results revealed a discrepancy between suspension and dry-dispersed particle measurements for this mineral dust. 
While the agreement was good below ~ −26 °C, the ice nucleation activity, expressed in ns, was smaller for the wet suspended samples and higher for the dry-dispersed aerosol samples between about −26 and −18 °C. Only instruments making measurements with wet suspended samples were able to measure ice nucleation above −18 °C. A possible explanation for the deviation between −26 and −18 °C is discussed. In general, the seventeen immersion freezing measurement techniques deviate, within a range of about 7 °C in terms of temperature, by three orders of magnitude with respect to ns. In addition, we show evidence that the immersion freezing efficiency (i.e., ns) of illite NX particles is relatively independent of droplet size, particle mass in suspension, particle size and cooling rate during freezing. A strong temperature dependence and weak time and size dependence of the immersion freezing efficiency of illite-rich clay mineral particles enabled the ns parameterization solely as a function of temperature. We also characterized the ns(T) spectra and identified a section with a steep slope between −20 and −27 °C, where a large fraction of active sites of our test dust may trigger immersion freezing. This slope was followed by a region with a gentler slope at temperatures below −27 °C. A multiple exponential distribution fit is expressed as ns(T) = exp(23.82 × exp(−exp(0.16 × (T + 17.49))) + 1.39) based on the specific surface area and ns(T) = exp(25.75 × exp(−exp(0.13 × (T + 17.17))) + 3.34) based on the geometric area (ns in m⁻² and T in °C). These new fits, constrained by using identical reference samples, will help to compare IN measurement methods that are not included in the present study and, thereby, IN data from future IN instruments.
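The two fits quoted in this abstract are simple to evaluate numerically. The Python sketch below copies the coefficients directly from the abstract; the function names `ns_ssa` and `ns_geo` are hypothetical, and using the fits only inside the stated spectrum range (−37 °C < T < −11 °C) is an assumption of this sketch.

```python
import math


def ns_ssa(temp_c):
    """ns(T) in m^-2 based on the specific surface area; temp_c in deg C."""
    return math.exp(23.82 * math.exp(-math.exp(0.16 * (temp_c + 17.49))) + 1.39)


def ns_geo(temp_c):
    """ns(T) in m^-2 based on the geometric area; temp_c in deg C."""
    return math.exp(25.75 * math.exp(-math.exp(0.13 * (temp_c + 17.17))) + 3.34)


# Evaluate both fits at a few temperatures inside the stated range.
for temp_c in (-35.0, -30.0, -25.0, -20.0, -15.0):
    print(f"T = {temp_c:6.1f} C  ns_ssa = {ns_ssa(temp_c):.3e}  ns_geo = {ns_geo(temp_c):.3e}")

# Orders of magnitude covered by the specific-surface-area fit across the range.
span = math.log10(ns_ssa(-37.0) / ns_ssa(-11.0))
print(f"span: {span:.1f} orders of magnitude")
```

Both fits increase monotonically toward colder temperatures, and the computed span is consistent with the roughly nine orders of magnitude in ns stated in the abstract.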